COMP4702 Assignment - Drosophila (Fruit Fly) Sex Classifier
Jack Barton SNo.46466293
Data Loading and Pre-processing
In [ ]:
import pandas as pd
from sklearn.impute import SimpleImputer
import seaborn as sns
import numpy as np
The data used was provided by: Hoffmann, Ary A.; Smith, Ailie; Griffin, Philippa C.; Hangartner, Sandra B. (2016). Data from: A collection of Australian Drosophila datasets on climate adaptation and species distributions [Dataset]. Dryad. https://doi.org/10.5061/dryad.k9c31
Load the data into a pandas DataFrame, print the column names, and visualize the data with pair plots.
In [ ]:
# Load the data into a pandas DataFrame
df = pd.read_csv("Data/83_Loeschcke_et_al_2000_Thorax_&_wing_traits_lab pops.csv")
# df = pd.read_csv("Data/84_Loeschcke_et_al_2000_Wing_traits_&_asymmetry_lab pops.csv")
# df = pd.read_csv("Data/85_Loeschcke_et_al_2000_Wing_asymmetry_lab_pops.csv")
# Print feature labels
print(df.columns.tolist())
# Draw pair plots of the numeric features, coloured in turn by Species, Population, Temperature and Sex
sns.pairplot(df, hue="Species")
sns.pairplot(df, hue="Population")
sns.pairplot(df, hue="Temperature")
sns.pairplot(df, hue="Sex")
['Species', 'Population', 'Latitude', 'Longitude', 'Year_start', 'Year_end', 'Temperature', 'Vial', 'Replicate', 'Sex', 'Thorax_length', 'l2', 'l3p', 'l3d', 'lpd', 'l3', 'w1', 'w2', 'w3', 'wing_loading']
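The pair plots above cover every numeric column, including sample metadata such as Latitude and Year_start, which makes the grid large. A minimal sketch of one way to restrict the plot to the morphological traits is given below. The split between metadata and trait columns is an assumption made for illustration (the notebook does not state it), and `trait_columns` and `plot_traits_by_sex` are hypothetical helper names, not part of the original analysis.

```python
# Column names copied from the printed df.columns output above.
COLUMNS = ['Species', 'Population', 'Latitude', 'Longitude', 'Year_start',
           'Year_end', 'Temperature', 'Vial', 'Replicate', 'Sex',
           'Thorax_length', 'l2', 'l3p', 'l3d', 'lpd', 'l3',
           'w1', 'w2', 'w3', 'wing_loading']

# Assumption: these columns describe the sample (origin, rearing conditions,
# bookkeeping) rather than the morphology of the fly.
METADATA_COLUMNS = {'Species', 'Population', 'Latitude', 'Longitude',
                    'Year_start', 'Year_end', 'Temperature', 'Vial',
                    'Replicate'}

# The label the classifier is meant to predict.
TARGET = 'Sex'


def trait_columns(columns, metadata=METADATA_COLUMNS, target=TARGET):
    """Return the columns that are neither metadata nor the target, in order."""
    return [c for c in columns if c not in metadata and c != target]


def plot_traits_by_sex(df):
    """Draw a pair plot of the trait columns only, coloured by the target.

    Assumes the trait columns of df are numeric.
    """
    import seaborn as sns  # imported lazily so trait_columns has no dependencies
    return sns.pairplot(df, vars=trait_columns(df.columns), hue=TARGET)


TRAITS = trait_columns(COLUMNS)
print(TRAITS)
```

With the DataFrame loaded earlier, calling `plot_traits_by_sex(df)` would draw a single grid over the ten trait columns. If any of those columns contain non-numeric entries, they would need cleaning first.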